Sparsity and persistence in time-frequency sound representations

Authors

Matthieu Kowalski

Bruno Torrésani

Vivek K. Goyal

Manos Papadakis

Dimitri Van De Ville

Abstract

It is well known that the time-frequency domain is very well adapted to representing audio signals. The two main features of time-frequency representations of many classes of audio signals are sparsity (signals are generally well approximated using a small number of coefficients) and persistence (significant coefficients are not isolated and tend to form clusters). This contribution presents signal approximation algorithms that exploit these properties in the framework of hierarchical probabilistic models. Given a time-frequency frame (i.e., a Gabor frame, or a union of several Gabor frames or time-frequency bases), coefficients are first gathered into groups. A group of coefficients is then modeled as a random vector whose distribution is governed by a hidden state associated with the group. Algorithms for parameter inference and hidden state estimation from analysis coefficients are described. The role of the chosen dictionary, and more particularly its structure, is also investigated. The proposed approach bears some resemblance to variational approaches previously proposed by the authors (in particular the variational approach exploiting regularization terms based on mixed norms). For audio signal applications, the time-frequency frame under consideration is a union of two MDCT bases or two Gabor frames, chosen so as to generate estimates of the tonal and transient layers. Groups corresponding to tonal (resp. transient) coefficients are constant-frequency (resp. constant-time) time-frequency coefficients of a frequency-selective (resp. time-selective) MDCT basis or Gabor frame.
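
The grouping and hidden-state idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose model and inference procedure are not given here. It assumes each group is a zero-mean Gaussian vector whose variance depends on a binary hidden state (inactive or active), and it fits that two-state model with a plain EM loop. The function name `active_group_posteriors`, the array shapes, and the synthetic coefficients are all hypothetical.

```python
import numpy as np


def active_group_posteriors(groups, n_iter=30):
    """Posterior probability that each group is in the 'active' hidden state.

    Assumed model (illustrative only): a group is a zero-mean Gaussian vector
    with a small variance when inactive and a large variance when active.
    groups: array of shape (n_groups, group_size), one group per row.
    """
    groups = np.asarray(groups, dtype=float)
    n_groups, size = groups.shape
    energy = np.sum(groups ** 2, axis=1)  # squared norm of each group

    # Initialise the two state variances around the average coefficient power.
    mean_var = energy.mean() / size + 1e-12
    var = np.array([0.1 * mean_var, 10.0 * mean_var])
    prior = np.array([0.5, 0.5])

    for _ in range(n_iter):
        # E-step: log posterior of each state for each group, up to a constant.
        log_post = (np.log(prior)
                    - 0.5 * size * np.log(2.0 * np.pi * var)
                    - 0.5 * energy[:, None] / var)
        log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate state probabilities and per-state variances.
        weight = resp.sum(axis=0) + 1e-12
        prior = weight / weight.sum()
        var = (resp * energy[:, None]).sum(axis=0) / (size * weight) + 1e-12

    # The state with the larger variance is taken as the 'active' one.
    return resp[:, np.argmax(var)]


# Synthetic stand-ins for analysis coefficients, indexed (frequency, time).
rng = np.random.default_rng(0)
tonal = 0.05 * rng.standard_normal((16, 32))
tonal[3, :] += rng.standard_normal(32)       # one persistent frequency line
transient = 0.05 * rng.standard_normal((16, 32))
transient[:, 10] += rng.standard_normal(16)  # one short broadband event

p_tonal = active_group_posteriors(tonal)            # groups: constant frequency
p_transient = active_group_posteriors(transient.T)  # groups: constant time
```

Passing the tonal array as-is treats each row (constant frequency) as a group, while transposing the transient array treats each column (constant time) as a group, which mirrors the two group structures named in the abstract.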

Similar resources

Comparison of frequency-warped representations for source separation of stereo mixtures

We evaluate the use of different frequency-warped, nonuniform time-frequency representations for the purpose of sound source separation from stereo mixtures. Such transformations enhance frequency resolution in spectral areas relevant for the discrimination of the different sources, improving sparsity and mixture disjointness. In this paper, we study the effect of using such representations on ...

Auditory Sketches: Very Sparse Representations of Sounds Are Still Recognizable

Sounds in our environment like voices, animal calls or musical instruments are easily recognized by human listeners. Understanding the key features underlying this robust sound recognition is an important question in auditory science. Here, we studied the recognition by human listeners of new classes of sounds: acoustic and auditory sketches, sounds that are severely impoverished but still reco...

The effect of sound at different frequencies on selective attention and human reaction time

Background and aims: Sound is one of the most effective exogenous factors affecting brain processing mechanisms, including attention, which in turn affects human error and occupational accidents. The purpose of this study was to investigate the effect of sound frequency on noise annoyance, selective attention and human response time. Methods: This research is an interventional study that was con...

A sparsity-perspective to quadratic time-frequency distributions

We examine nonstationary signals within the framework of compressive sensing and sparse reconstruction. Most of these signals, which arise in numerous applications, exhibit small relative occupancy in the time-frequency domain, casting them as sparse in a joint-variable representation. We present two general approaches to incorporate sparsity into time-frequency analysis, leading to what we ref...

Persistence Codebooks for Topological Data Analysis

Topological data analysis, such as persistent homology has shown beneficial properties for machine learning in many tasks. Topological representations, such as the persistence diagram (PD), however, have a complex structure (multiset of intervals) which makes it difficult to combine with typical machine learning workflows. We present novel compact fixed-size vectorial representations of PDs bas...

Publication date: 2017
